DETAILED ACTION

1. This is responsive to the amendment after non-final filed on 20 August 2008.

2. Claims 1 - 29 are still pending and considered below. 

Response to Arguments 

3. Applicant's arguments with respect to claims 1 - 29 have been considered but 
are moot in view of the new ground(s) of rejection. 

Claim Objections 

4. Claim 1 is objected to because of the following informalities: in the preamble, it is 
believed, "tuning the text-to-speech conversion process" should be 'tuning a text-to- 
speech conversion process' (emphasis added). 

5. Claim 20 is objected to because of the following informalities: it is believed it 
should depend on claim 19 so "the parameterized aligned sound records formant" can 
have a proper antecedent basis. 

Claim Rejections - 35 USC § 102 

6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States.

7. Claims 1 - 7, 10, and 12 - 29 are rejected under 35 U.S.C. 102(b) as being
anticipated by Miyatake (USPN 5,842,167).

Claim 1:

Miyatake discloses a system for tuning a text-to-speech conversion process
(Abstract), the system comprising: 

a text-to-speech engine (Fig. 1, item 4 and related text), said text-to-speech 
engine receiving at least one text-input and converting said text-input into a processed 
representation (prosodic data, col. 4, lines 8-11), said processed representation 
including at least one segment (synthesis unit, col. 4, lines 8-11) and a word- 
boundary (Fig. 4, dashed lines between synthesis units), each associated with at least 
one speech feature, wherein said at least one speech feature of said word-boundary 
includes boundary strength and pause duration (Fig. 4, PAUSE, and related text. Note 
that the boundary strength is represented by the distance between the dashed lines 
between synthesis units); and 

a visual editing interface (Fig. 1, item 5 and related text) displaying said 
processed representation using at least one graphical indicator including displayed 
segments and displayed boundaries on an output device (Fig. 4 and related text), 
wherein said visual editing interface displays and allows editing of said at least one 
speech feature of said word-boundary by editing (a) a displayed boundary and (b) 
spacing between a displayed segment and said displayed boundary ("the displayed
characters are edited as shown in FIG. 4, the speech synthesizing means 6 inserts
pauses at the beginning and the end of "hai", which have wider character
spacings", col. 4, lines 45-50. See also Figs. 3, 4, and related text).

Claim 2:

Miyatake discloses the system of claim 1 wherein said visual editing interface 
provides at least one editing function to a user, the editing function enabling 
modification of said speech feature associated with said segment through a change in 
the corresponding said graphical indicator (col. 2, lines 35-51). 

Claim 3: 

Miyatake discloses the system of claim 2 wherein said visual editing interface 
associates said speech feature corresponding to said segment with said graphical 
indicator, wherein the user's modification of said graphical indicator results in a 
corresponding change in said speech feature of said segment (col. 2, lines 35-51). 

Claim 4: 

Miyatake discloses the system of claim 1 wherein said speech feature is at least 
one of the following: normalized text, part-of-speech, parsing of text, chunking of text, 
boundary strength, pause duration, transcription, speech rate, syllable duration, 
segment duration, pitch, word prominence, emphasis, formant mixing mode, unit 
selection override, intensity contour, formant trajectories, and allophone rules (Fig. 4, 
PAUSE, and related text). 

Claim 5: 

Miyatake discloses the system of claim 1 wherein said graphical indicator 
comprises at least one of the following: graphical style, font faces, coloring, vertical 
spacing, horizontal spacing, italicization, boldness, underlining, blinking, crossing-out,
text orientation, text rotation, punctuation symbols and graphical symbols (col. 3, lines 
24-34). 

Claim 6: 

Miyatake discloses the system of claim 1 wherein said processed representation 
employs a parameterized aligned sound records format (col. 3, lines 35-45).

Claim 7:

Miyatake discloses the system of claim 1 wherein said segment comprises at 
least one of the following: word, letter, syllable, pause, word boundary and punctuation- 
mark (Fig. 4, hai, and related text). 

Claim 10: 

Miyatake discloses the system of claim 1 wherein said visual editing interface 
allows definition of said input-text by providing a set of text messages containing non- 
editable text and editable blank slots into which at least part of said input-text can be 
entered (col. 5, lines 22-24. Note that text on the word processor's inherent icons is non- 
editable). 

Claim 12: 

Miyatake discloses the system of claim 1 wherein said visual editing interface 
provides the user with speech audio output of said processed representation (Fig. 1, 
item 8 and related text). 

Claim 13:

Miyatake discloses the system of claim 1 wherein visual editing interface is
connected to a data-store for storing and retrieving said representation (Fig. 1, item 4
and related text. Note that data must be stored in order to be retrieved). 

Claims 14 - 16: 

Miyatake discloses the system of claim 1 wherein the said processed 
representation is a modified textual representation, wherein the input text is used to 
generate said processed representation (Fig. 4 and related text), and wherein said 
modified textual representation is stored and accessed from a data store (Fig. 1, item 4 
and related text. Note that data must be stored in order to be retrieved). 

Claim 17: 

Miyatake discloses the system of claim 14 wherein said textual representation is 
used to generate synthesized speech using a TTS system (Fig. 1, item 6 and related 
text) distinct from said text-to-speech engine (Fig. 1, item 4 and related text).

Claim 18: 

Miyatake discloses a system for providing a text-to-speech interface (Abstract), 
the system comprising: 

a visual interface connected to a text-to-speech engine (Fig. 1, item 5 and related 
text); and 

at least one communication channel connecting said visual interface to said text- 
to-speech engine (Fig. 1, communication link between items 5 and 6), said text-to-
speech engine communicating with said visual interface over said communication 
channel by sending and receiving at least one data segment in a format (Fig. 1, item 6
and related text), 

wherein said visual interface communicates variations in one or more types of 
speech features associated with segments of said data by varying visual display 
properties of the segments (col. 3, lines 23-34), at least one of said speech features 
includes boundary strength and pause duration, and said visual display properties are 
applied to at least one of (a) a displayed boundary between adjacent segments and (b) 
spacing between a segment and said displayed boundary ("the displayed
characters are edited as shown in FIG. 4, the speech synthesizing means 6 inserts
pauses at the beginning and the end of "hai", which have wider character
spacings", col. 4, lines 45-50. See also Figs. 3, 4, and related text).

Claim 19: 

Miyatake discloses the system of claim 18 wherein said format of said data 
segment is a parameterized aligned sound records format (col. 3, lines 35-45).

Claim 20:

Miyatake discloses the system of claim 19 wherein said text-to-speech engine 
sends said data segment in the parameterized aligned sound records format to said 
visual interface, said visual interface rendering said data segment in a visual form, said 
visual interface allowing editing of said data segment to produce an edited data 
segment, said visual interface sending said edited data segment to said text-to-speech 
engine (col. 3, lines 35-45).

Claim 21:

Miyatake discloses the system of claim 18 wherein said visual interface sends 
data to said text-to-speech engine over a first said communication channel and said 
text-to-speech engine sends data to said visual interface over a second said 
communication channel (Fig. 1, items 5, 6, and related text). 

Claim 22: 

Miyatake discloses a method for visual tuning text-to-speech conversion process, 
the method comprising: 

converting an input-text to a processed representation using a text-to-speech 
engine, said processed representation including at least one speech feature of said 
input-text (col. 3, lines 19-23); 

displaying said processed representation on a visual editing interface connected 
to said text-to-speech engine, said speech feature of said processed representation 
being displayed in a corresponding graphical form (col. 3, lines 35-45); 

communicating variations in one or more types of speech features associated 
with segments of said representation by varying visual display properties of the 
segments (col. 3, lines 24-35), wherein said speech features include boundary strength 
and pause duration, and said visual display properties are applied to at least one of (a) 
a displayed boundary between adjacent segments and (b) spacing between a segment 
and said displayed boundary ("the displayed characters are edited as
shown in FIG. 4, the speech synthesizing means 6 inserts pauses
at the beginning and the end of "hai", which have wider
character spacings", col. 4, lines 45-50. See also Figs. 3, 4, and related text); and

providing an editing function in said visual editing interface to a user for modifying 
said speech feature in said graphical form (col. 2, lines 35-51). 

Claim 23: 

Miyatake discloses the method of claim 22 further comprising: generating speech 
audio equivalent of said processed representation through said visual editing interface 
(col. 3, lines 35-53). 

Claim 24: 

Miyatake discloses the method of claim 22 further comprising: saving said 
processed representation in a data store; and loading said processed representation 
stored in said data store into said visual editing interface (Fig. 1, item 4 and related text. 
Note that data must be stored in order to be retrieved). 

Claim 25: 

Miyatake discloses the method of claim 22 further comprising: converting said 
processed representation into a modified textual representation of the processed input- 
text (col. 3, lines 35-53). 

Claim 26: 

Miyatake discloses the method of claim 25 further comprising: converting said 
textual representation into a processed representation, wherein the input text is used to 
generate said processed representation (Fig. 4 and related text). 

Claim 27:

Miyatake discloses the method of claim 25 further comprising: storing said
modified textual representation in a data store; and loading said modified textual 
representation stored in said data store into said visual editing interface (Fig. 1, item 4 
and related text. Note that data must be stored in order to be retrieved. See also Fig. 4 
and related text). 

Claim 28: 

Miyatake discloses the method of claim 25 further comprising: using said 
modified textual representation to synthesize speech using a TTS system (Fig. 1, item 6 
and related text) distinct from said text-to-speech engine (Fig. 1, item 4 and related
text). 

Claim 29: 

Miyatake discloses the system of claim 1, wherein said visual editing interface 
displays a modified textual representation of said text-input, and variations in visual 
display for communicating different speech features individually associated with 
different textual segments of the textual representation include a combination of at least 
two of: (a) variations in graphical length of the textual segments; (b) variations in vertical 
positions of the textual segments; (c) variations in horizontal spacing of the textual 
segments; (d) variations in font faces of the textual segments; (e) variations in coloring 
of the textual segments; (f) variations in styles of the textual segments; (g) variations in 
orientation of the textual segments; (h) variations in rotation of the textual segments; or 
(i) punctuation of the textual segments (col. 4, lines 55-67).

Claim Rejections - 35 USC § 103

8. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

9. Claims 8, 9, and 11 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Miyatake (USPN 5,842,167) in view of Kobal et al (USPN 7,099,828). 

Claims 8 and 9: 

Miyatake discloses the system of claim 1 but does not explicitly disclose wherein 
said visual editing interface operates as a plug-in for a graphical user interface wherein 
said plug-in is an ActiveX control. 

In a similar TTS system where a user can specify the pronunciation for a given 
text, Kobal discloses a visual editing interface as a standalone tool (col. 3, lines 46-49). 
In addition, ActiveX controls are reusable software components developed in the 1990's 
by Microsoft to enable enhanced formatting of web pages. 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to have operated Miyatake's editing interface as an ActiveX plug-in because 
ActiveX technology adds interactivity and more functionality, such as animation or pop- 
up menu, and can be written in a number of software languages. 

Claim 11: 

Miyatake discloses the system of claim 1 but does not explicitly disclose wherein 
said visual editing interface is language independent.

In a similar TTS system where a user can specify the pronunciation for a given
text, Kobal discloses a visual editing interface that is language independent (col. 5, lines 
60-67). 

It would have been obvious to one with ordinary skill in the art at the time of the 
invention to have made Miyatake's visual editing interface language independent in 
order to use Miyatake's system in different parts of the world where different languages 
are used (Kobal, col. 3, lines 30-32). 

Conclusion 

10. Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to SAMUEL G. NEWAY whose telephone number is 
(571)270-1058. The examiner can normally be reached on Monday - Friday 8:30AM - 
5:30PM EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David R Hudspeth, can be reached on 571-272-7843. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/David R Hudspeth/ 

Supervisory Patent Examiner, Art Unit 2626 

/S. G. N./

Examiner, Art Unit 2626 



